    Large-Scale Music Genre Analysis and Classification Using Machine Learning with Apache Spark

    The trend for listening to music online has greatly increased over the past decade due to the growing number of music tracks available online. The large music libraries provided by online music content distribution vendors make music streaming and downloading services more accessible to the end-user. It is essential to classify similar types of songs with an appropriate tag or index (genre) to present similar songs in a convenient way to the end-user. As the trend of online music listening continues to increase, developing multiple machine learning models to classify music genres has become a main area of research. In this research paper, the popular GTZAN music dataset, which contains ten music genres, is analysed to study various types of music features and audio signals. Multiple scalable machine learning algorithms supported by Apache Spark, including naïve Bayes, decision tree, logistic regression, and random forest, are investigated for the classification of music genres. The performance of these classifiers is compared, and random forest performs best for the classification of music genres. Apache Spark is used in this paper to reduce the computation time for machine learning predictions with no computational cost, as it focuses on parallel computation. The present work also demonstrates that the combination of Apache Spark and machine learning algorithms reduces the scalability problem of the computation of machine learning predictions. Moreover, different hyperparameters of the random forest classifier are optimized to increase the performance of the classifier in the domain of music genre classification. The experimental outcome shows that the developed random forest classifier can achieve a high level of accuracy, especially for the mislabelled, distorted GTZAN dataset. This classifier has outperformed the other machine learning classifiers supported by Apache Spark in the present work. The random forest classifier achieves 90% accuracy for music genre classification compared to other work in the same domain

    Novel online Recommendation algorithm for Massive Open Online Courses (NoR-MOOCs)

    Massive Open Online Courses (MOOCs) have gained in popularity over the last few years. The space of online learning resources has been increasing exponentially and has created a problem of information overload. To overcome this problem, recommender systems that can recommend learning resources to users according to their interests have been proposed. MOOCs contain a huge amount of data, and the quantity of data increases as new learners register. Traditional recommendation techniques suffer from scalability, sparsity, and cold-start problems, resulting in poor-quality recommendations. Furthermore, they cannot accommodate incremental updates of the model as new data arrive, making them unsuitable for the dynamic MOOC environment. Following this line of research, we propose a novel online recommender system, namely NoR-MOOCs, that is accurate, scales well with the data, and moreover overcomes previously recorded problems with recommender systems. Through extensive experiments conducted over the COCO dataset, we have shown empirically that NoR-MOOCs significantly outperforms traditional KMeans and Collaborative Filtering algorithms in terms of predictive and classification accuracy metrics

    A novel DeepMaskNet model for face mask detection and masked facial recognition

    Coronavirus disease (COVID-19) has significantly affected the daily life activities of people globally. To prevent the spread of COVID-19, the World Health Organization has recommended that people wear face masks in public places. Manual inspection of people for wearing face masks in public places is a challenging task. Moreover, the use of face masks makes traditional face recognition techniques, which are typically designed for unveiled faces, ineffective. This introduces an urgent need to develop a robust system capable of detecting people who are not wearing face masks and recognizing different persons while they wear a face mask. In this paper, we propose a novel DeepMaskNet framework capable of both face mask detection and masked facial recognition. Moreover, there is presently no unified and diverse dataset that can be used to evaluate both face mask detection and masked facial recognition. For this purpose, we also developed a large-scale and diverse unified mask detection and masked facial recognition (MDMFR) dataset to measure the performance of both face mask detection and masked facial recognition methods. Experimental results on multiple datasets, including the cross-dataset setting, show the superiority of our DeepMaskNet framework over contemporary models

    A reinforcement learning recommender system using bi-clustering and Markov Decision Process

    Collaborative filtering (CF) recommender systems are static in nature and do not adapt well to changing user preferences. User preferences may change after interaction with a system or after buying a product. Conventional CF clustering algorithms only identify the distribution of patterns and hidden correlations globally. However, the inability of these algorithms to discover local patterns led to the popularization of bi-clustering algorithms. Bi-clustering algorithms can analyze all dataset dimensions simultaneously and consequently discover local patterns that deliver a better understanding of the underlying hidden correlations. In this paper, we modelled the recommendation problem as a sequential decision-making problem using Markov Decision Processes (MDP). To perform state representation for the MDP, we first converted the user-item voting matrix to a binary matrix. Then we performed bi-clustering on this binary matrix to determine a subset of similar rows and columns. A bi-cluster merging algorithm is designed to merge similar and overlapping bi-clusters. These bi-clusters are then mapped to a squared grid (SG). Reinforcement learning (RL) is applied on this SG to determine the best policy for giving recommendations to users. The start state is determined using the Improved Triangle Similarity (ITR) similarity measure. The reward function is computed as the grid state overlap in terms of users and items in the current and prospective next state. A thorough comparative analysis was conducted, encompassing a diverse array of methodologies, including RL-based, pure Collaborative Filtering (CF), and clustering methods. The results demonstrate that our proposed method outperforms its competitors in terms of precision, recall, and optimal policy learning
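The two steps this abstract states most concretely, binarizing the user-item matrix and scoring a move between grid states by user and item overlap, can be sketched as follows. This is only an illustration under stated assumptions and not the authors' implementation: the abstract gives neither the binarization threshold nor the exact overlap formula, so the `threshold` default, the Jaccard-style overlap, and the names `binarize` and `overlap_reward` are all hypothetical.

```python
def binarize(ratings, threshold=3):
    """Turn a user-item rating matrix into a 0/1 matrix.

    Assumption: a rating >= threshold counts as 1; the abstract does not
    state the rule that was actually used.
    """
    return [[1 if r >= threshold else 0 for r in row] for row in ratings]


def overlap_reward(state, next_state):
    """Score a transition by how much two bi-clusters share.

    Each state is a (users, items) pair of sets. Assumption: overlap is
    measured as Jaccard similarity of the user sets plus Jaccard similarity
    of the item sets; the paper's exact reward function may differ.
    """
    def jaccard(a, b):
        union = a | b
        return len(a & b) / len(union) if union else 0.0

    users_a, items_a = state
    users_b, items_b = next_state
    return jaccard(users_a, users_b) + jaccard(items_a, items_b)


# Toy data: 2 users x 3 items, 0 meaning "not rated".
binary = binarize([[5, 0, 2], [3, 4, 1]])
reward = overlap_reward(({1, 2}, {10, 11}), ({2, 3}, {11}))
```

With this toy reward, neighbouring grid states that share more users and items receive a higher score, which is the qualitative behaviour the abstract describes.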

    Stock market prediction using machine learning classifiers and social media, news

    Accurate stock market prediction is of great interest to investors; however, stock markets are driven by volatile factors such as microblogs and news that make it hard to predict a stock market index based merely on historical data. The enormous stock market volatility emphasizes the need to effectively assess the role of external factors in stock prediction. Stock markets can be predicted using machine learning algorithms on information contained in social media and financial news, as this data can change investors’ behavior. In this paper, we use algorithms on social media and financial news data to discover the impact of this data on stock market prediction accuracy for ten subsequent days. To improve the performance and quality of predictions, feature selection and spam tweet reduction are performed on the data sets. Moreover, we perform experiments to identify stock markets that are difficult to predict and those that are more influenced by social media and financial news. We compare the results of different algorithms to find a consistent classifier. Finally, to achieve maximum prediction accuracy, deep learning is used and some classifiers are ensembled. Our experimental results show that the highest prediction accuracies of 80.53% and 75.16% are achieved using social media and financial news, respectively. We also show that New York and Red Hat stock markets are hard to predict, New York and IBM stocks are more influenced by social media, while London and Microsoft stocks are more influenced by financial news. The random forest classifier is found to be consistent, and the highest accuracy of 83.22% is achieved by its ensemble

    Identifying Users with Wearable Sensors based on Activity Patterns

    We live in a world where ubiquitous systems surround us in the form of automated homes, smart appliances and wearable devices. These ubiquitous systems not only enhance productivity but can also provide assistance given a variety of different scenarios. However, these systems are vulnerable to the risk of unauthorized access, hence the ability to authenticate the end-user seamlessly and securely is important. This paper presents an approach for user identification given the physical activity patterns captured using on-body wearable sensors, such as accelerometer, gyroscope, and magnetometer. Three machine learning classifiers have been used to discover the activity patterns of users given the data captured from wearable sensors. The recognition results prove that the proposed scheme can effectively recognize a user’s identity based on his/her daily living physical activity patterns

    Modeling user rating preference behavior to improve the performance of the collaborative filtering based recommender systems

    One of the main concerns for online shopping websites is to provide efficient and customized recommendations to a very large number of users based on their preferences. Collaborative filtering (CF) is the most widely known recommender system method for providing personalized recommendations to users. CF generates recommendations by identifying clusters of similar users or items from the user-item rating matrix. This cluster of similar users or items is generally identified by using some similarity measure. Among the numerous similarity measures proposed by researchers, the Pearson correlation coefficient (PCC) is commonly used for CF-based recommender systems. The standard PCC suffers from some inherent limitations and ignores user rating preference behavior (RPB). Typically, users have different RPB: some users may give the same rating to various items without liking the items, and some users may tend to give an average rating despite liking the items. Traditional similarity measures (including PCC) do not consider this rating pattern of users. In this article, we present a novel similarity measure that considers user RPB while calculating similarity among users. The proposed similarity measure states user RPB as a function of the user's average rating value and variance or standard deviation. The user RPB is then combined with an improved model of standard PCC to form an improved similarity measure for CF-based recommender systems. The proposed similarity measure is named improved PCC weighted with RPB (IPWR). The qualitative and quantitative analysis of the IPWR similarity measure is performed using five state-of-the-art datasets (i.e. Epinions, MovieLens-100K, MovieLens-1M, CiaoDVD, and MovieTweetings). The IPWR similarity measure performs better than state-of-the-art similarity measures in terms of mean absolute error (MAE), root mean square error (RMSE), precision, recall, and F-measure
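For orientation, the standard PCC baseline that this abstract builds on can be sketched in a few lines, along with the two per-user statistics (average rating and standard deviation) that the abstract says the RPB term depends on. This is a minimal sketch, not the IPWR measure itself: the abstract does not give the formula that combines RPB with the improved PCC. The function names, the dict-based rating format, and the choice to take means over co-rated items only are assumptions made for illustration.

```python
from math import sqrt
from statistics import mean, pstdev


def pearson_similarity(ratings_u, ratings_v):
    """Standard PCC between two users over the items both have rated.

    Ratings are dicts mapping item id -> rating. Assumption: means are
    taken over the co-rated items only; returns 0.0 when undefined.
    """
    common = set(ratings_u) & set(ratings_v)
    if len(common) < 2:
        return 0.0
    mean_u = mean(ratings_u[i] for i in common)
    mean_v = mean(ratings_v[i] for i in common)
    num = sum((ratings_u[i] - mean_u) * (ratings_v[i] - mean_v) for i in common)
    den_u = sqrt(sum((ratings_u[i] - mean_u) ** 2 for i in common))
    den_v = sqrt(sum((ratings_v[i] - mean_v) ** 2 for i in common))
    if den_u == 0 or den_v == 0:
        return 0.0  # a user who rates everything identically has no variance
    return num / (den_u * den_v)


def rating_profile(ratings):
    """Average rating and population standard deviation for one user."""
    values = list(ratings.values())
    return mean(values), pstdev(values)


# Hypothetical users: bob rates every item exactly one point below alice.
alice = {"a": 5, "b": 3, "c": 4}
bob = {"a": 4, "b": 2, "c": 3}
flat = {"a": 3, "b": 3, "c": 3}  # same rating for every item
similarity = pearson_similarity(alice, bob)
```

In this toy data the PCC rates alice and bob as near-perfectly similar even though bob rates consistently lower, while the constant-rating user yields no usable correlation. Both cases are examples of the rating-pattern differences that plain PCC does not capture.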

    Reconstructing the Engram: Simultaneous, Multisite, Many Single Neuron Recordings

    Little is known about the physiological principles that govern large-scale neuronal interactions in the mammalian brain. Here, we describe an electrophysiological paradigm capable of simultaneously recording the extracellular activity of large populations of single neurons, distributed across multiple cortical and subcortical structures in behaving and anesthetized animals. Up to 100 neurons were simultaneously recorded after 48 microwires were implanted in the brain stem, thalamus, and somatosensory cortex of rats. Overall, 86% of the implanted microwires yielded single neurons, and an average of 2.3 neurons were discriminated per microwire. Our population recordings remained stable for weeks, demonstrating that this method can be employed to investigate the dynamic and distributed neuronal ensemble interactions that underlie processes such as sensory perception, motor control, and sensorimotor learning in freely behaving animals

    How can health systems be strengthened to control and prevent an Ebola outbreak? a narrative review

    The emergence and re-emergence of infectious diseases are now more than ever considered threats to public health systems. There have been over 20 outbreaks of Ebola in the past 40 years. Only recently, the World Health Organization has declared a public health emergency of international concern (PHEIC) in West Africa, with a projected estimate of 1.2 million deaths expected in the next 6 months. Ebola virus is a highly virulent pathogen, often fatal in humans and non-human primates. Ebola is now a great priority for global health security and often becomes fatal if left untreated. This study employed a narrative review. Three major databases MEDLINE, EMBASE, and Global Health were searched using both ‘text-words’ and ‘thesaurus terms’. Evidence shows that low- and middle-income countries (LMICs) are not coping well with the current challenges of Ebola, not only because they have poor and fragile systems but also because there are poor infectious disease surveillance and response systems in place. The identification of potential cases is problematic, particularly in the aspects of contact tracing, infection control, and prevention, prior to the diagnosis of the case. This review therefore aims to examine whether LMICs’ health systems would be able to control and manage Ebola in future and identifies two key elements of health systems strengthening that are needed to ensure the robustness of the health system to respond effectively